Detecting Local Audio-visual Synchrony in Monologues Utilizing Vocal Pitch and Facial Landmark Trajectories
Authors
Steven Cadavid (1), Mohamed Abdel-Mottaleb (1), Daniel S. Messinger (2), Mohammad H. Mahoor (3), Lorraine E. Bahrick (4)
(1) University of Miami, Department of Electrical and Computer Engineering
(2) University of Miami, Department of Electrical and Computer Engineering
(3) University of Denver, Department of Electrical and Computer Engineering
(4) Florida International University, Department of Psychology
Related resources
Audio-visual synchrony for detection of monologues in video archives
In this paper we present our approach to detecting monologues in video shots. A monologue shot is defined as a shot containing a talking person in the video channel with the corresponding speech in the audio channel. Whilst motivated by the TREC 2002 Video Retrieval Track (VT02), the underlying approach of synchrony between audio and video signals is also applicable for voice and face-based biome...
Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap
It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated estimates of audio-visual temporal synchrony. However, it re...
Efficient Melodic Query Based Audio Search for Hindustani Vocal Compositions
Time-series pattern matching methods that incorporate time warping have recently been used with varying degrees of success on tasks of search and discovery of melodic phrases from audio for Indian classical vocal music. While these methods perform effectively due to the minimal assumptions they place on the nature of the sampled pitch temporal trajectories, their practical applicability to retr...
Robust audio-visual speech synchrony detection by generalized bimodal linear prediction
We study the problem of detecting audio-visual synchrony in video segments containing a speaker in frontal head pose. The problem holds a number of important applications, for example speech source localization, speech activity detection, speaker diarization, speech source separation, and biometric spoofing detection. In particular, we build on earlier work, extending our previously proposed ti...
Detecting Depression from Facial Actions and Vocal Prosody
Current methods of assessing psychopathology depend almost entirely on verbal report (clinical interview or questionnaire) of patients, their family, or caregivers. They lack systematic and efficient ways of incorporating behavioral observations that are strong indicators of psychological disorder, much of which may occur outside the awareness of either individual. We compared clinical diagnosi...